A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3

نویسندگان

  • Sabine Dietmann
  • Jong Park
  • Cédric Notredame
  • Andreas Heger
  • Michael Lappe
  • Liisa Holm
چکیده

The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary v.3

The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank. The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to (1) supersecondary structur...

متن کامل

Touring protein fold space with Dali/FSSP

The FSSP database and its new supplement, the Dali Domain Dictionary, present a continuously updated classification of all known 3D protein structures. The classification is derived using an automatic structure alignment program (Dali) for the all-against-all comparison of structures in the Protein Data Bank. From the resulting enumeration of structural neighbours (which form a surprisingly con...

متن کامل

A consensus view of fold space: combining SCOP, CATH, and the Dali Domain Dictionary.

We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the cl...

متن کامل

Structural classification of thioredoxin-like fold proteins.

Protein structure classification is necessary to comprehend the rapidly growing structural data for better understanding of protein evolution and sequence-structure-function relationships. Thioredoxins are important proteins that ubiquitously regulate cellular redox status and various other crucial functions. We define the thioredoxin-like fold using the structure consensus of thioredoxin homol...

متن کامل

Assessment of Protein Domain Classifications: SCOP, CATH, Dali and EVEREST

Background: SCOP is a manual classification of protein domain structures. CATH is a classification of protein domain structures created through a combination of manual and automatic methods. The Dali Domain Classification (henceforth Dali) is an automatically generated classification of protein domain structures. EVEREST is an automatically generated classification of protein domains that uses ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Nucleic acids research

دوره 29 1  شماره 

صفحات  -

تاریخ انتشار 2001